Decision Key-Value Feature Construction for Multihoming Big Data Network

Abstract

When dealing with multihoming big data network problems, the random forest algorithm under the MapReduce framework suffers from too many redundant and irrelevant features, low training feature information, and low parallelization efficiency. To address this, a parallel random forest algorithm based on information theory and norms (PRFITN) is proposed. The technique first builds a hybrid dimensionality reduction approach (DRIGFN), based on information gain and the Frobenius norm, which reduces the number of redundant and irrelevant features and results in a dimensionality-reduced dataset. An information-theory-based strategy is then offered: features are grouped with the FGSIT strategy, and stratified sampling is employed to assure the quantity of training feature information when building the decision trees of the random forest. When datasets are provided as key/value pairs, it is common to want to aggregate statistics across all objects with the same key. Finally, in the Reduce stage, a key-value pair redistribution method (RSKP) is used to acquire the global classification results and to achieve a rapid and equal distribution of key-value pairs, which improves the cluster's parallel efficiency. According to the experimental findings, the approach provides a superior classification impact in large multihoming networks, particularly for datasets with numerous features. Feature selection and feature extraction can be used together. In addition to minimizing overfitting and redundancy, lowering dimensionality contributes to improved human interpretation and cheaper computing costs through model simplicity.
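
Two of the ideas named in the abstract, ranking features by information gain and aggregating statistics across all objects that share a key, can be illustrated with a minimal Python sketch. This is not the paper's DRIGFN or RSKP procedure: the function names (entropy, information_gain, aggregate_by_key) are hypothetical, the features are assumed to be discrete, the Frobenius-norm step is not covered, and everything runs in a single process rather than on a MapReduce cluster.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus label entropy conditioned on a discrete feature.

    Hypothetical helper: a low score marks a candidate irrelevant feature.
    """
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def aggregate_by_key(pairs):
    """Reduce-style aggregation: mean of the values seen for each key.

    Hypothetical helper: stands in for a per-key statistic computed
    over key/value pairs; it does no redistribution across nodes.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}
```

In this sketch, a feature that separates the classes perfectly scores an information gain equal to the label entropy, while a feature that is independent of the labels scores zero, so a threshold on the score could serve as a simple filter before training.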

Similar resources

Cover Feature Big Data

Today’s applications often contain datasets that are too big to fit in a single computer’s main memory. Analyzing these massive datasets will require scalable and sophisticated machine-learning methods. Two commonly used approaches are stochastic optimization and inference algorithms,1 which process one data point at a time; and distributed computing based on the MapReduce framework,2 where the...

Application of Big Data Analytics in Power Distribution Network

Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...

Massively-Parallel Feature Selection for Big Data

We present the Parallel, Forward-Backward with Pruning (PFBP) algorithm for feature selection (FS) in Big Data settings (high dimensionality and/or sample size). To tackle the challenges of Big Data FS, PFBP partitions the data matrix both by rows (samples, training examples) and by columns (features). By employing the concepts of p-values of conditional independence tests and meta-...

Key Technologies for Big Data Stream Computing

As a new trend for data-intensive computing, real-time stream computing is gaining significant attention in the Big Data era. In theory, stream computing is an effective way to support Big Data by providing extremely low-latency processing tools and massively parallel processing architectures in real-time data analysis. However, in most existing stream computing environments, how to efficiently...

Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm

This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...

Journal

Journal title: Wireless Communications and Mobile Computing

Year: 2023

ISSN: 1530-8669, 1530-8677

DOI: https://doi.org/10.1155/2023/2977126